Empirical Risk Minimization: Probabilistic Complexity and Stepsize Strategy

Authors

  • Chin Pang Ho
  • Panos Parpas
Abstract

Empirical risk minimization (ERM) is a special class of problem in standard convex optimization. When a first-order method is used, the Lipschitz constant of the empirical risk plays a crucial role in the convergence analysis and stepsize strategies for these problems. We derive probabilistic bounds for such Lipschitz constants using random matrix theory. We show that, on average, the Lipschitz constant is bounded by the ratio of the dimension of the problem to the amount of training data. We use our results to develop a new stepsize strategy for first-order methods. The proposed algorithm, the Probabilistic Upper-bound Guided stepsize strategy (PUG), outperforms standard stepsize strategies and comes with strong theoretical guarantees on its performance.
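The PUG rule itself is not reproduced here. As a minimal sketch of the underlying idea, the Python snippet below runs gradient descent on a least-squares empirical risk and, instead of computing the exact Lipschitz constant L = lambda_max(A^T A)/n, plugs in the classical random-matrix limit (1 + sqrt(d/n))^2 as a probabilistic upper-bound proxy, assuming standardized i.i.d. features; all names are illustrative, not from the paper.

```python
import numpy as np

def erm_gradient_descent(A, b, iters=200):
    """Gradient descent on f(x) = (1/2n) * ||A x - b||^2.

    Hypothetical illustration: the exact Lipschitz constant
    L = lambda_max(A^T A) / n is replaced by the Bai-Yin limit
    (1 + sqrt(d/n))^2, which upper-bounds L with high probability
    for standardized i.i.d. features -- a stand-in for the kind of
    probabilistic bound the abstract describes.
    """
    n, d = A.shape
    L_proxy = (1.0 + np.sqrt(d / n)) ** 2   # probabilistic proxy for L
    step = 1.0 / L_proxy                    # standard 1/L stepsize
    x = np.zeros(d)
    for _ in range(iters):
        grad = A.T @ (A @ x - b) / n        # gradient of the empirical risk
        x -= step * grad
    return x

# Usage: n = 1000 samples, d = 50 features
rng = np.random.default_rng(0)
A = rng.standard_normal((1000, 50))
b = A @ rng.standard_normal(50) + 0.1 * rng.standard_normal(1000)
x_hat = erm_gradient_descent(A, b)
```

The appeal of a bound driven by the d/n ratio is that it costs nothing to evaluate, whereas computing lambda_max(A^T A) exactly requires touching the full data matrix.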

Related Articles

Empirical Risk Minimization for Probabilistic Grammars: Sample Complexity and Hardness of Learning

Probabilistic grammars are generative statistical models that are useful for compositional and sequential structures. They are used ubiquitously in computational linguistics. We present a framework, reminiscent of structural risk minimization, for empirical risk minimization of probabilistic grammars using the log-loss. We derive sample complexity bounds in this framework that apply both to the...
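The snippet is truncated; as a hypothetical illustration of the log-loss empirical risk it refers to, here is a minimal sketch for a simple categorical model standing in for a grammar's rule probabilities (names are illustrative):

```python
import numpy as np

def log_loss_empirical_risk(theta, samples):
    """Empirical risk under the log-loss: -(1/n) * sum_i log p(x_i; theta).

    Here p is a categorical distribution over a finite outcome set,
    a toy stand-in for the derivation probabilities of a
    probabilistic grammar.
    """
    probs = np.asarray(theta, dtype=float)
    probs = probs / probs.sum()              # normalize to a distribution
    return -np.mean(np.log(probs[samples]))  # average negative log-likelihood

# Usage: 3 outcomes, 6 observed sample indices
theta = [0.5, 0.3, 0.2]
samples = np.array([0, 0, 1, 2, 1, 0])
print(log_loss_empirical_risk(theta, samples))
```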

Rademacher penalties and structural risk minimization

We suggest a penalty function to be used in various problems of structural risk minimization. This penalty is data-dependent and is based on the sup-norm of the so-called Rademacher process indexed by the underlying class of functions (sets). The standard complexity penalties, used in learning problems and based on the VC dimensions of the classes, are conservative upper bounds (in a probabilist...
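Again the snippet is truncated; below is a minimal sketch of such a data-dependent penalty, assuming a finite function class and Monte Carlo averaging over the Rademacher signs (the paper treats general classes via the sup-norm of the Rademacher process); names are illustrative.

```python
import numpy as np

def rademacher_penalty(F_values, n_draws=1000, rng=None):
    """Monte Carlo estimate of the data-dependent penalty
    E_sigma[ sup_{f in F} |(1/n) * sum_i sigma_i f(x_i)| ]
    for a finite class: row j of F_values holds the values
    f_j(x_1), ..., f_j(x_n) of the j-th function on the sample.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    _, n = F_values.shape
    sups = np.empty(n_draws)
    for k in range(n_draws):
        sigma = rng.choice([-1.0, 1.0], size=n)    # Rademacher signs
        sups[k] = np.abs(F_values @ sigma).max() / n
    return float(sups.mean())

# Usage: 5 binary "classifiers" evaluated on n = 100 data points
rng = np.random.default_rng(1)
F = rng.choice([-1.0, 1.0], size=(5, 100))
print(rademacher_penalty(F))
```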

Asynchronous Parallel Algorithms for Nonconvex Big-Data Optimization Part II: Complexity and Numerical Results

We present complexity and numerical results for a new asynchronous parallel algorithmic method for the minimization of the sum of a smooth nonconvex function and a convex nonsmooth regularizer, subject to both convex and nonconvex constraints. The proposed method hinges on successive convex approximation techniques and a novel probabilistic model that captures key elements of modern computation...

Online Learning Via Regularized Frequent Directions

Online Newton step algorithms usually achieve good performance with fewer training samples than first-order methods, but require higher space and time complexity in each iteration. In this paper, we develop a new sketching strategy called regularized frequent direction (RFD) to improve the performance of online Newton algorithms. Unlike the standard frequent direction (FD), which only maintains a...
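The regularized variant (RFD) is cut off before its definition, so only the standard frequent directions baseline it builds on is sketched below, assuming a sketch size ell < d; names are illustrative.

```python
import numpy as np

def frequent_directions(A, ell):
    """Standard frequent directions (FD) sketch: returns B with ell rows
    such that B^T B approximates A^T A, with spectral-norm error on the
    order of ||A||_F^2 / ell. The paper's RFD additionally maintains a
    regularization term, which is omitted here.
    """
    n, d = A.shape
    assert ell < d, "this sketch assumes ell < d"
    B = np.zeros((ell, d))
    for row in A:
        free = np.flatnonzero(~B.any(axis=1))    # empty sketch rows
        if free.size == 0:
            # sketch is full: shrink singular values to free the last row
            _, s, Vt = np.linalg.svd(B, full_matrices=False)
            s_shrunk = np.sqrt(np.maximum(s**2 - s[-1] ** 2, 0.0))
            B = s_shrunk[:, None] * Vt
            free = np.flatnonzero(~B.any(axis=1))
        B[free[0]] = row
    return B

# Usage: sketch a 500 x 40 data stream down to 10 rows
A = np.random.default_rng(2).standard_normal((500, 40))
B = frequent_directions(A, ell=10)
```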

Empirical Risk Minimization with Approximations of Probabilistic Grammars

When approximating a family of probabilistic grammars, it is convenient to assume the degree of the grammar is limited. We limit the degree of the grammar by making the assumption that N_k ≤ 2. This assumption may seem, at first glance, somewhat restrictive, but we show next that for probabilistic context-free grammars (and as a consequence, other formalisms), this assumption does not restrict g...


Journal:

Volume   Issue

Pages  -

Publication date: 2016